home *** CD-ROM | disk | FTP | other *** search
- From greyham Thu Oct 28 18:42:35 1993
- Newsgroups: comp.lang.c++,comp.programming.literate
- Subject: An Automatic C++ documentation compilation project.
- Summary: Anyone willing to add C++ support to c2man?
- Keywords: c2man, C, C++, Literate Programming, Documentation
-
- Copyright 1993, 1994 by Graham Stoney.
- This may be freely redistributed or quoted, so long as it's attributed to me.
-
- Writing and maintaining documentation has often been a thorn in the side of the
- Software Engineer and Programmer. After spending a great deal of time and
- effort writing documentation about a program or software system, the code
- invariably changes, quickly rendering the documentation out of date. The
- documentation becomes misleading, gets neglected, and quickly becomes useless.
-
- "Literate Programming" is one approach to solving this problem. It effectively
- introduces a whole new (typesetting) language, requires a quite radical shift
- on the part of the "non-literate" programmer and still requires a good deal of
- effort on the part of the programmer[1].
-
- I'd like to suggest a different approach which lies considerably closer to
- more traditional programming practices, and can offer quite immediate benefits
- when functional interface documentation is the main documentation required.
-
- The primary philosophy here is to use the programming language as far as
- possible to express the programmer's intentions, and to use comments only when
- the programming language is not sufficiently expressive. A comment can then
- become part of the language grammar which is recognised by a "documentation
- compiler". This tool parses a superset of the programming language and can
- automatically generate documentation in human-readable form by associating the
- programmer's comments with the objects in the code by their context.
-
- Whilst the idea of extracting documentation from comments in source code is by
- no means new, the difference here is that the comments actually form part of
- the grammar of the language recognised by the documentation compiler[2].
-
- Comments should not repeat information that is already represented in the
- program code; for instance, a comment describing a function argument should not
- repeat the name and type of that argument (since that information has already
- been included, for the compiler), but should appear near the argument.
-
- For example, in C, the programmer should write this:
-
- /* include an example in the article */
- enum Result example(int page /* page it appears on */);
-
- Rather than this:
-
- /* include an example in the article
- *
- * PARAMETERS:
- * int page page it appears on
- *
- * RETURNS:
- * RESULT_YES The readers agreed
- * RESULT_NO The readers disagreed
- * RESULT_YOURE_JOKING The readers disagreed strongly
- * RESULT_BLANK_LOOKS The readers didn't understand
- */
- enum Result example(int page);
-
-
- Also in this example, the documentation compiler knows the possible enumerated
- values that the function can return (as does the "real" compiler), so it is
- unnecessary for the programmer to restate them. The comments need simply be
- included in the definition for "enum Result" for the "RETURNS" information to
- be generated automatically:
-
- enum Result {
- RESULT_YES, /* The readers agreed */
- RESULT_NO, /* The readers disagreed */
- RESULT_YOURE_JOKING, /* The readers disagreed strongly */
- RESULT_BLANK_LOOKS /* The readers didn't understand */
- };
-
- Critics have suggested that the latter style in the example is easier to read
- for someone wishing to call the function in question. Of course, this is a
- style question which depends on each person's tastes; but the criticism is tied
- to the notion that the source code needs to look "beautiful" because it is the
- primary reference for someone wishing to use that function. This becomes much
- less significant once documentation is available which is known to _always_ be
- up to date. Of course, the latter style takes longer to write and maintain,
- and can become out of date should the name or type of the parameter be
- changed, yet the comment get neglected.
-
- I have implemented one such documentation compiler for the C language called
- "c2man", which is freely available[3]. The response from users has been
- extremely encouraging; I suspect this is partly because of the wide variety of
- styles of comment placement that are recognised: it often correctly recognises
- comments that weren't written with c2man in mind at all. While it's use is
- focused solely on functional interface documentation and it doesn't have
- anywhere near the power of a full Literate Programming system, the focus is on
- reducing the effort required by the programmer to the absolute minimum, and
- seeing how much documentation we can get essentially "for free".
-
- Many people have requested C++ support be added to c2man, and I suspect that
- this philosophy would be even more suitable and powerful for documenting
- interfaces to C++ classes automatically.
-
- Here is an example of how I envisage this philosophy would work when applied to
- C++. It's interesting to note that this code was written a couple of years ago
- exactly as you see it here, without the idea of generating documentation from
- it in mind at all:
-
-
- // generic Timer class
- class Timer
- {
- private:
- static int numactive; // number of constructed timers.
- static Timer *first; // first one in list.
- Timer *next; // next one in linked list.
- Time ticksdiff; // ticks we take to expire once at front.
-
- enum
- {
- INACTIVE, // timer is not in chain.
- STARTED, // one-shot
- RUNNING // continuous.
- } state;
-
- // original interrupt vector value.
- static void interrupt (far *old_vector)(...);
-
- void (*timeout_function)(int); // function called when we time out
- int timeout_parameter; // gets passed to timeout_function
- Time duration; // timer length (ticks)
-
- static void interrupt far tick(...); // clock tick routine.
-
- void insert(); // add into active chain.
- void remove(); // remove from active chain.
- void set(Time milliseconds); // set duration from ms.
-
- public:
- // constructor
- Timer(Time time=0, // milliseconds
- void (*function)(int)=0, // called at timeout
- int param=-1); // param for function
-
- // destructor
- ~Timer();
-
- // start (or restart) a timer running.
- void Start();
- void Start(Time duration); // how long to run for
-
- // start a timer running continuous.
- void Run();
-
- // stop a timer.
- void Stop();
-
- // is a timer active?
- boolean Active() const { return state != INACTIVE; };
- };
-
-
- Processing this class declaration could generate the following automatically:
-
- NAME
- Timer - generic timer class
-
- SYNOPSIS
- class Timer
- {
- public:
- Timer(Time time=0,
- void (*function)(int)=0,
- int param=-1);
- ~Timer();
- void Start();
- void Start(Time duration);
- void Run();
- void Stop();
- boolean Active() const;
- };
-
- PARAMETERS
- Time time
- Milliseconds
-
- void (*function)(int)
- called at timeout.
-
- int param
- Param for function.
-
- Time duration
- How long to run for.
-
- DESCRIPTION
- Timer
- Constructor
-
- ~Timer
- Destructor
-
- Start
- Start (or restart) a timer running.
-
- Run
- Start a timer running continuous.
-
- Stop
- Stop a timer.
-
- Active
- Is a timer active?.
-
-
- It should also be possible to extract this information from the implementation
- of the class (rather than the declaration), if that's where the user prefers to
- put the comments describing each member function and their parameters.
-
- The ideal tool should:
- 1. Avoid imposing a style on the programmer.
- 2. Work out section names (NAME, SYNOPSIS etc) without the programmer having
- to specify them explicitly.
- 3. Handle C++ and C style code equally well.
- 4. Not require the programmer to restate information which is already expressed
- in the syntax of the programming language.
- 5. Work reasonably well with existing code.
- 6. Flatten the class hierarchy so that the documentation for each class
- includes virtually everything the user needs to know about it.
-
- A number of tools already exist which attempt to tackle this problem, such as
- class2man, genman, classdoc and docclass. They vary in sophistication,
- utility, and the demands they place on the programmer; however, none as yet
- meet all the criteria set out above, and no one tool will suit the tastes of
- all programmers.
-
- Pouring lots of effort into a really ``smart'' documentation generator makes
- sense because once it's done, you get a payback for every document you
- generate. Every little feature added to the documentation generator to make
- things easier for the programmer pays off multiple times, and minimising the
- effort required by the programmer is the key.
-
- The logical starting point would be to graft Jim Roskind's C++ grammar[4] into
- c2man, modifying it to recognise comments in the relevant places, and adding
- all the necessary structures to hold the information from the parser that will
- get included in the output. Very little functional change should be needed in
- the lexer, which already recognises C++ comments.
-
- Unfortunately, at present I do not have sufficient spare time to make the
- additions to c2man required to support C++. It would be a great contribution to
- the C++ community, not to mention the documentation time saved by themselves,
- for someone involved in C++ work to add this support and release the result[5].
-
- If you work with a team developing C++ code, please consider having one of your
- developers on a ``Usenet Sabbatical'' to extend this philosophy to C++, and
- start reaping the benefits in documentation time savings.
-
- It could also make an ideal Computer Science student compiler project.
-
- Please contact me via E-mail if you are interested in undertaking such a
- project.
-
-
- Graham Stoney
- greyham@research.canon.oz.au
-
- Footnotes:
- 1. Advocates of Literate Programming would argue that Literate Programming is
- much more than snazzy documents and that it encourages this extra effort to
- focus early on in the design of the software, which pays off later.
-
- 2. To get a better idea, see the file grammar.y in the c2man distribution.
-
- 3. c2man has been posted to comp.sources.misc. It should be available from:
- location: ftp from any comp.sources.misc archive, in volume42
- (the version in the comp.sources.reviewed archive is obsolete)
- ftp /pub/Unix/Util/c2man-2.0.*.tar.gz from dnpap.et.tudelft.nl
- Australia: ftp /usenet/comp.sources.misc/volume42/c2man-2.0/*
- from archie.au
- N.America: ftp /usenet/comp.sources.misc/volume42/c2man-2.0/*
- from ftp.wustl.edu
- Europe: ftp /News/comp.sources.misc/volume42/c2man-2.0/*
- from ftp.irisa.fr
- Japan: ftp /pub/NetNews/comp.sources.misc/volume42/c2man-2.0/*
- from ftp.iij.ad.jp
- Patches: ftp pub/netnews/sources.bugs/volume93/sep/c2man* from lth.se
-
- 4. Jim Roskind's yaccable C++ grammar is available via ftp from
- ics.uci.edu in the ftp/pub directory as:
-
- c++grammar2.0.tar.Z
- byacc1.8.tar.Z
-
- 5. c2man's copyright requires that all derivative works remain freely
- available.
-
-